Statistical text-to-speech synthesis with improved dynamics
Authors
Abstract
•Apply a DFT of length dT_i to the quasi-periodic sequence:
–Harmonic frequencies: k = lT_i, l = 1, 2, ..., d
–Non-harmonic frequencies: k + 1, k + 2, ..., k + T_i − 1, where k = 1, T_i, 2T_i, ..., (d − 1)T_i
•Non-harmonic content (NHC) in statistically generated phonemes is much lower than NHC in natural phonemes.
Improving Speech Features Dynamics Learning Non-Harmonic Components Statistics
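The harmonic/non-harmonic split described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy signal, and the noise level are all hypothetical. It also uses the standard DFT indexing, in which a sequence made of `num_periods` identical periods has energy only in bins that are multiples of `num_periods`. That indexing is an assumption and may not match the abstract's own index notation.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def non_harmonic_ratio(x, num_periods):
    """Fraction of spectral energy (DC excluded) outside the harmonic bins.

    Assumes x holds num_periods pitch periods of equal length, so a perfectly
    periodic x has energy only in bins that are multiples of num_periods.
    """
    energy = [abs(c) ** 2 for c in dft(x)]
    total = sum(energy[1:])
    harmonic = sum(
        e for k, e in enumerate(energy) if k > 0 and k % num_periods == 0
    )
    return (total - harmonic) / total if total > 0 else 0.0


# Hypothetical toy data: d = 4 periods, each T = 16 samples long.
d, T = 4, 16
period = [
    math.sin(2 * math.pi * t / T) + 0.5 * math.sin(4 * math.pi * t / T)
    for t in range(T)
]
periodic = period * d  # exactly periodic: no non-harmonic content

# Period-to-period variation, modelled here as additive Gaussian noise.
rng = random.Random(0)
perturbed = [v + 0.2 * rng.gauss(0.0, 1.0) for v in periodic]

nhc_periodic = non_harmonic_ratio(periodic, d)
nhc_perturbed = non_harmonic_ratio(perturbed, d)
print(nhc_periodic, nhc_perturbed)
```

In this toy setting the exactly periodic sequence has essentially zero non-harmonic energy, while the perturbed one does not. This mirrors the contrast the abstract draws between statistically generated and natural phonemes.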
Similar resources
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives computers the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...
IRWIN AND JOAN JACOBS CENTER FOR COMMUNICATION AND INFORMATION TECHNOLOGIES: Statistical Text-To-Speech Synthesis based on Segment-wise Representation with a Norm Constraint
In statistical HMM-based TTS systems (STTS), speech feature dynamics are modelled by first- and second-order feature frame differences, which typically do not satisfactorily represent the frame-to-frame feature dynamics present in natural speech. The reduced dynamics result in over-smoothing of speech features, so that synthesized speech often sounds muffled and buzzy. In this work we propose a method...
Corpus-based techniques in the AT&T NextGen synthesis system
The AT&T text-to-speech (TTS) synthesis system has been used as a framework for experimenting with a perceptually guided data-driven approach to speech synthesis, with primary focus on data-driven elements in the "back end". Statistical training techniques applied to a large corpus are used to make decisions about predicted speech events and selected speech inventory units. Our recent advances i...
Statistical prosodic modeling: from corpus design to parameter estimation
The increasing availability of carefully designed and collected speech corpora opens up new possibilities for the statistical estimation of formal multivariate prosodic models. At Apple Computer, statistical prosodic modeling exploits the Victoria corpus, recently created to broadly support ongoing speech synthesis research and development. This corpus is composed of five constituent parts, eac...
Stages and method of preparing syllable and diphone speech databases for a Persian text-to-speech system
Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and describes how they were developed, their specifications, and their advantages relative to each other. ...